83 research outputs found

    On the Structure of Bispecial Sturmian Words

    A balanced word is one in which any two factors of the same length contain the same number of occurrences of each letter of the alphabet, up to one. Finite binary balanced words are called Sturmian words. A Sturmian word is bispecial if it can be extended to the left and to the right with both letters while remaining a Sturmian word. There is a deep relation between bispecial Sturmian words and Christoffel words, which are the digital approximations of Euclidean segments in the plane. In 1997, J. Berstel and A. de Luca proved that \emph{palindromic} bispecial Sturmian words are precisely the maximal internal factors of \emph{primitive} Christoffel words. We extend this result by showing that bispecial Sturmian words are precisely the maximal internal factors of \emph{all} Christoffel words. Our characterization allows us to give an enumerative formula for bispecial Sturmian words. We also investigate the minimal forbidden words for the language of Sturmian words. Comment: arXiv admin note: substantial text overlap with arXiv:1204.167
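
The balance condition in this abstract can be checked directly from its definition. The following is a minimal brute-force sketch, not taken from the paper; the function name `is_balanced` is hypothetical and the cubic-time loop is chosen for clarity rather than efficiency.

```python
def is_balanced(word: str) -> bool:
    """Check the balance condition by brute force.

    For every length and every letter, the number of occurrences of that
    letter in any two factors of that length may differ by at most one.
    """
    n = len(word)
    alphabet = set(word)
    for length in range(1, n + 1):
        # All factors (contiguous substrings) of the current length.
        factors = [word[i:i + length] for i in range(n - length + 1)]
        for letter in alphabet:
            counts = [factor.count(letter) for factor in factors]
            if max(counts) - min(counts) > 1:
                return False
    return True


if __name__ == "__main__":
    print(is_balanced("abaab"))  # a finite Sturmian word under this definition
    print(is_balanced("aabb"))   # "aa" and "bb" differ by two in the count of "a"
```

Under the abstract's definition, a binary word for which this check succeeds is a finite Sturmian word.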

    Factorizations of the Fibonacci Infinite Word

    The aim of this note is to survey the factorizations of the Fibonacci infinite word that make use of the Fibonacci words and other related words, and to show that all these factorizations can be easily derived in sequence starting from elementary properties of the Fibonacci numbers.
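
For orientation, the finite Fibonacci words mentioned here can be generated by concatenation. This sketch assumes one common convention (the first two words are "b" and "a", and each later word is the previous word followed by the one before it); the abstract does not fix a convention, and the name `fibonacci_words` is hypothetical.

```python
def fibonacci_words(count: int) -> list[str]:
    """Return the first `count` finite Fibonacci words.

    Assumed convention: f_0 = "b", f_1 = "a", f_{n+1} = f_n + f_{n-1}.
    """
    words = ["b", "a"]
    while len(words) < count:
        words.append(words[-1] + words[-2])
    return words[:count]


if __name__ == "__main__":
    for index, word in enumerate(fibonacci_words(7)):
        # The lengths follow the Fibonacci numbers: 1, 1, 2, 3, 5, 8, 13.
        print(index, len(word), word)
```

With this convention every word from f_1 onward is a prefix of the next one, so the words converge to the Fibonacci infinite word.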

    Vertical representation of $C^{\infty}$-words

    We present a new framework for dealing with $C^{\infty}$-words, based on their left and right frontiers. This allows us to give a compact representation of them, and to describe the set of $C^{\infty}$-words through an infinite directed acyclic graph $G$. This graph is defined by a map acting on the frontiers of $C^{\infty}$-words. We show that this map can be defined recursively and with no explicit references to $C^{\infty}$-words. We then show that some important conjectures on $C^{\infty}$-words follow from analogous statements on the structure of the graph $G$. Comment: Published in Theoretical Computer Science

    On the least number of palindromes contained in an infinite word

    We investigate the least number of palindromic factors in an infinite word. We first consider general alphabets, and give answers to this problem for periodic and non-periodic words, closed or not under reversal of factors. We then investigate the same problem when the alphabet has size two. Comment: Accepted for publication in Theoretical Computer Science

    Open and Closed Prefixes of Sturmian Words

    A word is closed if it contains a proper factor that occurs both as a prefix and as a suffix but does not have internal occurrences; otherwise it is open. We deal with the sequence of open and closed prefixes of Sturmian words and prove that this sequence characterizes every finite or infinite Sturmian word up to isomorphisms of the alphabet. We then characterize the combinatorial structure of the sequence of open and closed prefixes of standard Sturmian words. We prove that every standard Sturmian word, after swapping its first letter, can be written as an infinite product of squares of reversed standard words. Comment: To appear in WORDS 2013 proceedings

    A Classification of Trapezoidal Words

    Trapezoidal words are finite words having at most n+1 distinct factors of length n, for every n>=0. They encompass finite Sturmian words. We partition trapezoidal words into two disjoint subsets: open and closed trapezoidal words. A trapezoidal word is closed if its longest repeated prefix has exactly two occurrences in the word, the second one being a suffix of the word. Otherwise it is open. We show that open trapezoidal words are all primitive and that closed trapezoidal words are all Sturmian. We then show that trapezoidal palindromes are closed (and therefore Sturmian). This allows us to characterize the special factors of Sturmian palindromes. We end with several open problems. Comment: In Proceedings WORDS 2011, arXiv:1108.341
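
The defining bound on factor counts translates directly into a brute-force test. This is an illustrative sketch only; the name `is_trapezoidal` is hypothetical and no attempt is made at efficiency.

```python
def is_trapezoidal(word: str) -> bool:
    """Check that the word has at most n + 1 distinct factors of length n,
    for every n from 0 up to the length of the word."""
    size = len(word)
    for n in range(size + 1):
        # Set of distinct factors (contiguous substrings) of length n.
        factors = {word[i:i + n] for i in range(size - n + 1)}
        if len(factors) > n + 1:
            return False
    return True


if __name__ == "__main__":
    print(is_trapezoidal("aabb"))   # satisfies the bound at every length
    print(is_trapezoidal("aabba"))  # four distinct factors of length 2
```

The word "aabb" passes this test although its factors "aa" and "bb" differ by two in the number of "a"s, which illustrates that the trapezoidal condition is weaker than being balanced.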

    Abelian-Square-Rich Words

    An abelian square is the concatenation of two words that are anagrams of one another. A word of length $n$ can contain at most $\Theta(n^2)$ distinct factors, and there exist words of length $n$ containing $\Theta(n^2)$ distinct abelian-square factors, that is, distinct factors that are abelian squares. This motivates us to study infinite words such that the number of distinct abelian-square factors of length $n$ grows quadratically with $n$. More precisely, we say that an infinite word $w$ is {\it abelian-square-rich} if, for every $n$, every factor of $w$ of length $n$ contains, on average, a number of distinct abelian-square factors that is quadratic in $n$; and {\it uniformly abelian-square-rich} if every factor of $w$ contains a number of distinct abelian-square factors that is proportional to the square of its length. Of course, if a word is uniformly abelian-square-rich, then it is abelian-square-rich, but we show that the converse is not true in general. We prove that the Thue-Morse word is uniformly abelian-square-rich and that the function counting the number of distinct abelian-square factors of length $2n$ of the Thue-Morse word is $2$-regular. As for Sturmian words, we prove that a Sturmian word $s_{\alpha}$ of angle $\alpha$ is uniformly abelian-square-rich if and only if the irrational $\alpha$ has bounded partial quotients, that is, if and only if $s_{\alpha}$ has bounded exponent. Comment: To appear in Theoretical Computer Science. Corrected a flaw in the proof of Proposition
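
To make the counted objects concrete, the sketch below enumerates the distinct non-empty abelian-square factors of a finite word by brute force. It is an illustration of the definition only, not a method from the paper; the function names are hypothetical.

```python
def is_abelian_square(word: str) -> bool:
    """True if the word is non-empty, has even length, and its two halves
    are anagrams of one another."""
    half, remainder = divmod(len(word), 2)
    if remainder != 0 or half == 0:
        return False
    return sorted(word[:half]) == sorted(word[half:])


def distinct_abelian_squares(word: str) -> set[str]:
    """Return the set of distinct non-empty factors that are abelian squares."""
    n = len(word)
    return {
        word[i:j]
        for i in range(n)
        for j in range(i + 2, n + 1, 2)  # only even-length factors
        if is_abelian_square(word[i:j])
    }


if __name__ == "__main__":
    print(sorted(distinct_abelian_squares("abba")))
```

For "abba" the even-length factors are "ab", "bb", "ba" and "abba", of which only "bb" and "abba" are abelian squares.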

    On the Minimal Uncompletable Word Problem

    Let S be a finite set of words over an alphabet Sigma. The set S is said to be complete if every word w over the alphabet Sigma is a factor of some element of S*, i.e. w belongs to Fact(S*). If S is not complete, we are interested in finding bounds on the minimal length of words in Sigma* which are not elements of Fact(S*), in terms of the maximal length of words in S. Comment: 5 pages; added references, corrected typos

    The sequence of open and closed prefixes of a Sturmian word

    A finite word is closed if it contains a factor that occurs both as a prefix and as a suffix but does not have internal occurrences; otherwise it is open. We are interested in the {\it oc-sequence} of a word, which is the binary sequence whose $n$-th element is $0$ if the prefix of length $n$ of the word is open, or $1$ if it is closed. We exhibit results showing that this sequence is deeply related to the combinatorial and periodic structure of a word. In the case of Sturmian words, we show that these are uniquely determined (up to renaming letters) by their oc-sequence. Moreover, we prove that the class of finite Sturmian words is a maximal element with this property in the class of binary factorial languages. We then discuss several aspects of Sturmian words that can be expressed through this sequence. Finally, we provide a linear-time algorithm that computes the oc-sequence of a finite word, and a linear-time algorithm that reconstructs a finite Sturmian word from its oc-sequence. Comment: Published in Advances in Applied Mathematics. Journal version of arXiv:1306.225
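
The oc-sequence can be computed naively from the definition. The sketch below is quadratic or worse and is not the linear-time algorithm announced in the abstract. It assumes the usual convention that words of length at most one are closed and that, for longer words, the prefix-suffix factor must be non-empty; the names `is_closed` and `oc_sequence` are hypothetical.

```python
def is_closed(word: str) -> bool:
    """Naive test for closedness.

    Only the longest proper border (a factor that is both a proper prefix
    and a proper suffix) can be free of internal occurrences, so it is the
    only candidate that needs checking.
    """
    n = len(word)
    if n <= 1:
        return True  # assumed convention: the empty word and letters are closed
    border = ""
    for k in range(n - 1, 0, -1):
        if word.endswith(word[:k]):
            border = word[:k]
            break
    if not border:
        return False
    # The first occurrence after position 0 must be the suffix occurrence.
    return word.find(border, 1) == n - len(border)


def oc_sequence(word: str) -> list[int]:
    """Return 1 for each closed prefix and 0 for each open prefix."""
    return [1 if is_closed(word[:n]) else 0 for n in range(1, len(word) + 1)]


if __name__ == "__main__":
    print(oc_sequence("abaababa"))  # [1, 0, 1, 0, 1, 1, 0, 0]
```

Swapping the two letters of the input leaves the output unchanged, which is consistent with the abstract's statement that the oc-sequence determines a Sturmian word only up to renaming letters.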

    Binary Jumbled String Matching for Highly Run-Length Compressible Texts

    The Binary Jumbled String Matching problem is defined as follows: given a string $s$ over $\{a,b\}$ of length $n$ and a query $(x,y)$, with $x,y$ non-negative integers, decide whether $s$ has a substring $t$ with exactly $x$ $a$'s and $y$ $b$'s. Previous solutions created an index of size $O(n)$ in a pre-processing step, which was then used to answer queries in constant time. The fastest algorithms for construction of this index have running time $O(n^2/\log n)$ [Burcsi et al., FUN 2010; Moosa and Rahman, IPL 2010], or $O(n^2/\log^2 n)$ in the word-RAM model [Moosa and Rahman, JDA 2012]. We propose an index constructed directly from the run-length encoding of $s$. The construction time of our index is $O(n+\rho^2\log \rho)$, where $O(n)$ is the time for computing the run-length encoding of $s$ and $\rho$ is the length of this encoding; this is no worse than previous solutions if $\rho = O(n/\log n)$ and better if $\rho = o(n/\log n)$. Our index $L$ can be queried in $O(\log \rho)$ time. While $|L| = O(\min(n, \rho^{2}))$ in the worst case, preliminary investigations have indicated that $|L|$ may often be close to $\rho$. Furthermore, the algorithm for constructing the index is conceptually simple and easy to implement. In an attempt to shed light on the structure and size of our index, we characterize it in terms of the prefix normal forms of $s$ introduced in [Fici and Lipták, DLT 2011]. Comment: v2: only small cosmetic changes; v3: new title, weakened conjectures on size of Corner Index (we no longer conjecture it to be always linear in size of RLE); removed experimental part on random strings (these are valid but limited in their predictive power w.r.t. general strings); v3 published in IPL
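
The linear-size index used by the previous solutions can be illustrated with a naive construction. The sketch below is not the run-length-encoding index proposed in the abstract: it builds, in quadratic time, a table of the minimum and maximum number of $a$'s over all substrings of each length. It relies on the standard interval property of binary strings, namely that sliding a window by one position changes its count of $a$'s by at most one, so every value between the minimum and the maximum is attained. The names `build_index` and `query` are hypothetical.

```python
def build_index(s: str) -> list[tuple[int, int]]:
    """For each length m, store (min, max) of the number of 'a's over all
    substrings of s of length m. Entry 0 covers the empty substring."""
    n = len(s)
    # prefix[i] is the number of 'a's among the first i characters of s.
    prefix = [0]
    for ch in s:
        prefix.append(prefix[-1] + (1 if ch == "a" else 0))
    index = [(0, 0)]
    for m in range(1, n + 1):
        counts = [prefix[i + m] - prefix[i] for i in range(n - m + 1)]
        index.append((min(counts), max(counts)))
    return index


def query(index: list[tuple[int, int]], x: int, y: int) -> bool:
    """Decide whether some substring has exactly x 'a's and y 'b's."""
    m = x + y
    if m >= len(index):
        return False  # longer than the indexed string
    low, high = index[m]
    return low <= x <= high


if __name__ == "__main__":
    idx = build_index("abbab")
    print(query(idx, 1, 2))  # "abb" has one 'a' and two 'b's
    print(query(idx, 2, 1))  # no substring of length 3 has two 'a's
```

Each query is answered in constant time from a table of size linear in the length of the string, matching the index size and query time the abstract attributes to earlier work; only the construction here is naive.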